ABSTRACT

Methodological issues in the study of levels of knowledge are
reviewed, and needs for further research are explored. The review
covers 12 studies reported since the late 1970s. In the 12 studies
16 quantitative experiments were conducted. These were assessed for
internal and external validity. Analysis revealed a shortcoming in
study design, some confounds in data collection, some threats to the
statistical procedures, and other issues related to external
validity. For example, in 11 of 12 experiments, subjects were not
randomly selected. None of the 16 experiments included a discussion
of data assumptions for the statistical procedures used. In 8 of the
16 experiments, no information was given about the number of subjects
assigned to each group. Several other issues were found that might
have affected validity. In addition, none of the studies reported the
effect size of significant results. Research in more natural settings
is required, and further study of the compensatory effects of prior
knowledge is required. Research into age differences and on the
effects of prior instruction is needed. Two tables present study
data. (SLD)

The development of cognitive psychology and the burgeoning of interest in
students' knowledge organization have combined to offer insights into the reading
process. Many cognitive theorists and researchers have demonstrated that readers'
comprehension is enhanced if their preexisting knowledge is activated or if they are
provided with opportunities to build background knowledge (Anderson, Spiro, &
Anderson, 1978; Rumelhart & Ortony, 1977).

Since the late 1970s, new trends have emerged in the study of the acquisition of
knowledge. Some researchers have investigated the effects of prior knowledge upon
the learning and comprehension of readers with different levels of knowledge about
passage topics, with different abilities, or with different levels of expertise (Chi,
Glaser, & Rees, 1982; Means & Voss, 1985; Recht & Leslie, 1988; Schneider &
Korkel, 1989; Schneider, Korkel, & Weinert, 1989; Spilich, Vesonder, Chiesi, &
Voss, 1979; Stahl, Jacobson, Davis, & Davis, 1989). The results of this research
have shown that the extent of knowledge affects the quantity as well as the
quality of students' understanding of the text.

Efforts have also been made to explore the different levels of domain-specific
knowledge and their relation to the acquisition of domain-specific knowledge.
The research literature on levels of knowledge has revealed the superiority of high
knowledge (HK) individuals (experts) over low knowledge (LK) individuals (novices)
in certain respects. HK individuals tend to outperform LK individuals by recalling
more text information, providing rule-governed protocols (deep structures), engaging
in metacognitive processing, and being more accurate in their solutions to
problems.

There is, however, a scarcity of critical reviews of the studies in which
significant results were found in the investigation of levels of domain knowledge. In
particular, there is a need to examine the methodological rigor of the studies so that
research consumers can use that information to evaluate and judge the
quality of research in this field.

Therefore, the aim of this paper is to address some of the methodological
issues in the study of levels of knowledge and to discuss the further research
needed, based on what has been found in the literature.

Results of the Study

The following discussion is based on an analysis of 12 studies reported in
journal articles since the late 1970s. In the 12 studies, 16 quantitative experiments
were conducted. The criteria developed by Lysynchuk, Pressley, d'Ailly, Smith, and
Cake (1989) for assessing experimental studies were used to evaluate the
internal and external validity of the quantitative experiments. The analysis revealed a
shortcoming in the design of the studies, some confounds in data collection, some
threats to the statistical procedures, and issues related to external validity.

Insert Table 1 about Here 
In general design, the most striking shortcoming is that in 11 out of 12
experiments (about 92%) the subjects were not randomly selected. This obvious
shortcoming certainly affected the generalizability of the results found in the studies.
More important, the lack of randomization may cause systematic bias on
certain variables, which may have contributed to the significant treatment effects.

In addition to the lack of randomization in the general design, components of the data
collection process may have affected the results and conclusions of the studies we
examined. There were some possible confounds that may threaten the internal validity
of the studies. For example, only 10 out of 16 experiments (about 63%) reported
that subjects in all groups were exposed to the experimental materials
for the same amount of time. Only 9 out of 16 experiments (56%) reported the amount
of time that subjects in each group spent on the dependent variable task.

Shortcomings in data collection also include a lack of manipulation checks and
process measures. The former are checks on whether the subjects
performed the tasks as directed, which occurred in 4 out of 16 experiments (25%). The
latter are relatively direct measures of processing in addition to outcome
measures, which occurred in 3 out of 16 experiments (about 19%). In a fairly high
proportion of the experiments (8 out of 13, about 62%), the researchers did not report
interrater reliability in scoring the subjects' recall protocols, the most common
technique used in the study of levels of domain knowledge.
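
To make the interrater reliability criterion concrete, the following is a minimal sketch of one common agreement index, Cohen's kappa, for two raters scoring recall protocols. It assumes each idea unit is scored 1 (recalled) or 0 (not recalled); the function name, the scoring scheme, and the example scores are hypothetical and are not taken from any of the reviewed studies, which may have used other indices such as percent agreement or a correlation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items.

    rater_a and rater_b are equal-length sequences of category labels
    (here, hypothetically, 1 = idea unit recalled, 0 = not recalled).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)


# Hypothetical scores for eight idea units from one recall protocol.
scores_rater_a = [1, 1, 0, 1, 0, 0, 1, 1]
scores_rater_b = [1, 0, 0, 1, 0, 1, 1, 1]
print(round(cohens_kappa(scores_rater_a, scores_rater_b), 3))
```

Kappa corrects raw agreement for the agreement two raters would reach by chance, which is why it is often preferred to a simple percentage when most idea units fall in one category.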

In addition to threats to internal validity imposed by design and data collection,
issues related to statistical analysis may pose additional threats. The appropriateness
of the statistical tests was determined by the following principles: (a) selection of the
test statistic should logically fit the purpose of the study and best answer the questions
that are being explored; (b) data assumptions of the selected test statistics should be
met; (c) if assumptions are violated, researchers should be aware of the issues
involved and resort to alternatives to deal with the problem. In the 16 experiments we
examined, none included a discussion about data assumptions for the statistical 
procedure that researchers chose to use. In 8 out of 16 experiments (50%), no 
information was provided about the number of subjects assigned to each group, 
whereas in 6 out of 13 experiments (about 46%), standard deviations were
not included in the report. Without the standard deviations, we can
hardly determine whether the data assumptions underlying some test statistics were
met. For example, in one study, a number of pooled t-tests were used to assess the
difference between good and poor learners on several measures. Equal variance of the
groups is an important assumption for such a test. However, the researchers
did not give any information about variances. In another study, analysis of
covariance was used to process the data; however, the researchers gave no indication
that they had tested the linearity of the relation between the dependent
measure and the covariate or the homogeneity of the regression slopes. Also, in one
study the researchers were not explicit about the kind of test statistic used to process
the data.
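
The pooled t-test example can be illustrated with a short sketch of why the group variances matter. The code below is illustrative only: the function names and the recall scores are hypothetical, and it reports test statistics without p-values. It compares the pooled statistic, which assumes equal variances, with Welch's statistic, which does not, and reports the ratio of the two sample variances as a rough first check.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio(a, b):
    """Larger sample variance divided by the smaller one (1.0 = equal)."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


def pooled_t(a, b):
    """Student's t using a pooled variance estimate (assumes equal variances)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def welch_t(a, b):
    """Welch's t, which does not assume equal variances."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / sqrt(variance(a) / na + variance(b) / nb)


# Hypothetical recall scores for good and poor learners (unequal group sizes).
good_learners = [12, 14, 15, 13, 16, 14]
poor_learners = [5, 15, 2, 11]
print("variance ratio:", round(variance_ratio(good_learners, poor_learners), 2))
print("pooled t:", round(pooled_t(good_learners, poor_learners), 2))
print("Welch t:", round(welch_t(good_learners, poor_learners), 2))
```

With equal group sizes the two statistics coincide; with unequal sizes and unequal variances they diverge, which is exactly the situation a reader cannot evaluate when cell sizes and standard deviations go unreported.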
External validity concerns the generalizability of the results obtained to other
situations. Two broad types of external validity have been described: (a) population
validity and (b) ecological validity (Huck, Cormier, & Bounds, 1974). Population
validity involves generalization to other populations. Ecological validity involves
generalization to environments similar to those of the experimental conditions. While
threats to internal validity are functions of research design, threats to external validity
are not. In part, external invalidity is a result of inadequate descriptions of subjects,
independent variables, and dependent variables (Huck et al., 1974). Since the late
1980s, there has been an increased interest in evaluating ecological validity in
educational research. The same principles are applicable in the study of levels of
domain knowledge.



Insert Table 2 about here 



Table 2 summarizes the number (and percentage) of all studies that met a
particular external validity criterion. A most disturbing problem is that in 13 out of
16 experiments (81%) the researchers did not give an ample description of the subjects
they used. Moreover, most of the experiments did not report the
reading ability of the subjects (15 out of 16, about 94%) or the difficulty level of the
materials used (13 out of 16, about 81%). In terms of ecological validity, the biggest
problems found in the review are the laboratory
treatment conditions and the highly contrived texts used in the experiments.
In addition to the above issues, some other problems that arose in the studies
need to be addressed. In one study, a disproportionate number of female subjects
were placed in the low knowledge group. Sex differences may therefore
have contributed to the difference found between the high knowledge (HK) and low
knowledge (LK) groups, because most females might be less interested in baseball
games.

Another issue is the large sample size used in two studies. The
researchers could have included the effect size of the test so that research consumers
could evaluate the significant findings from a practical perspective. In fact, none of
the studies reported the effect size of the significant results.
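
As an illustration of why effect sizes matter with large samples, the sketch below computes Cohen's d (one widely used standardized effect size, chosen here as an assumption since the source names no particular index) from reported means, standard deviations, and group sizes, and shows how the same d converts to a pooled t statistic at a given sample size. All function names and numbers are hypothetical.

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd


def t_from_d(d, n1, n2):
    """Pooled two-sample t statistic implied by effect size d and group sizes."""
    return d / sqrt(1 / n1 + 1 / n2)


# Hypothetical summary statistics for HK and LK groups.
print("d:", cohens_d(10.0, 2.0, 20, 8.0, 2.0, 20))

# The same small effect (d = 0.2) at two hypothetical sample sizes.
for n_per_group in (20, 500):
    print(n_per_group, "per group -> t =", round(t_from_d(0.2, n_per_group, n_per_group), 2))
```

A small standardized difference can yield a large t statistic once the groups are big enough, so statistical significance alone says little about practical importance. The calculation also needs only means, standard deviations, and cell sizes, the same information whose absence is noted above.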

Conclusion 

This critique of the internal validity of reading comprehension research with 
different levels of knowledge by no means indicates that we are looking for "perfect" 
studies conducted in various settings. Instead, we argue that researchers should 
consider issues involved in statistical analyses when assumptions are not met, when 
unequal cell sizes are used in factorial design, and when "... a particular solution is 
selected on rational grounds so that those selections can be rationally described and 
defined" (Levin, 1985, p. 227). Statistical problems can be avoided by including 
information about means, standard deviations, and number of subjects. 

When assumptions underlying statistical procedures are not met, there are
alternative analysis strategies. In the case of repeated measures designs, when the
assumption of sphericity is not met, a univariate contrast approach might be
appropriate (Levin, 1985). When the assumption of homogeneity of regression slopes is
not tenable, the Johnson-Neyman technique is a viable alternative (Stevens, 1990).
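
Before an alternative such as the Johnson-Neyman technique is considered, the slope assumption itself has to be checked. The following is a minimal sketch of that check, not of the Johnson-Neyman technique: it computes the F statistic that compares a model with a separate regression slope in each group against the ANCOVA model with one common slope. The function name and the data are hypothetical, and the sketch stops at the F statistic, which would then be referred to an F distribution with k - 1 and N - 2k degrees of freedom.

```python
def slope_homogeneity_f(groups):
    """F statistic for the homogeneity-of-regression-slopes assumption.

    groups is a list of (covariate_values, outcome_values) pairs, one per
    group. Assumes every group has covariate variation and that the
    separate-slopes model does not fit perfectly (otherwise a division
    by zero occurs).
    """
    k = len(groups)
    n_total = 0
    sxx_sum = sxy_sum = syy_sum = 0.0
    sse_full = 0.0  # residual sum of squares with one slope per group
    for xs, ys in groups:
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        syy = sum((y - mean_y) ** 2 for y in ys)
        sse_full += syy - sxy ** 2 / sxx
        sxx_sum += sxx
        sxy_sum += sxy
        syy_sum += syy
        n_total += n
    # Residual sum of squares with a single common slope, separate intercepts.
    sse_reduced = syy_sum - sxy_sum ** 2 / sxx_sum
    df_num = k - 1
    df_den = n_total - 2 * k
    return ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)


# Hypothetical data: covariate (e.g., aptitude) and outcome (e.g., recall).
covariate = [1, 2, 3, 4]
group_steep = (covariate, [2.1, 3.9, 6.1, 7.9])   # outcome rises with covariate
group_flat = (covariate, [1.0, 1.2, 0.9, 1.1])    # outcome barely changes
print(round(slope_homogeneity_f([group_steep, group_flat]), 1))
```

A large F suggests that the slopes differ across groups, in which case a single adjusted group difference is not meaningful and region-based approaches such as Johnson-Neyman become relevant.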

The extent to which results may be generalized to other populations depends in
part on the degree of description of the sample participating in the study and the
conditions under which the study is conducted. Researchers in comprehension with
different levels of knowledge could increase the external validity of their work by
providing more complete descriptions of dependent measures (including reliabilities),
reading levels of materials, and reading abilities of students. External validity could
be improved further by including delayed measures. Determining long term effects of
strategy instruction provides indications of internalization and may be useful in
informing future instruction.

Future Research Needed 

First, research is needed to extend some findings in the study of levels of 
knowledge to more natural settings by using natural text in different knowledge 
domains. The findings in some studies were based on more controlled and artificial 
text (Spilich et al., 1979; Chiesi, Spilich, & Voss, 1979). The results may be 
different in a more natural setting with a more authentic text. 

Second, the efficacy of providing prior knowledge to compensate for the inefficiency
of low knowledge individuals and for low aptitude is inconclusive. Further research is
needed to specify where and in what situations the compensation does not occur.
Third, it remains a question to be answered by further research whether young
experts could learn a more complex schema. There is a controversy over whether
young experts would perform as well as older experts even if they could learn a more
complex schema.

Finally, there is a scarcity of intervention studies that have investigated the effect
of instruction in prior knowledge and learning strategies on students' learning and
comprehension in a knowledge domain.
Table 1. Ratio (and percentage) of studies that met criteria for internal
validity (adapted from Lysynchuk et al., 1989).

                                                  Studies that met
                                                  criterion
Description of criterion                          Ratio*    %

General design

Control group present.                            16/16   100
Subjects were randomly assigned to conditions.    1/12      8
Subject mortality was approximately equal in      16/16   100
  treatment and control conditions.
Independent variables were explicitly described.  16/16   100
Dependent variables were explicitly described.    16/16   100
Dependent measures had face validity.             16/16   100
Hawthorne effects were unlikely.                  11/11   100
Conclusions followed logically from the data.     16/16   100

Possible confounds

Trained and control subjects exposed to same      16/16   100
  materials.

Both trained and control subjects had equal       10/16    63
  time of exposure to materials.
Information was provided about time on task       9/16     56
  for both control and trained subjects.
The same experimenter provided treatment to       8/10     80
  all conditions.
Absence of additional confounds.                  6/16     38

Measurement

There were manipulation checks to determine       4/16     25
  that subjects did as instructed.
Alternate forms were used with repeated           1/1     100
  dependent measures.
There were no ceiling or floor effects.           6/6     100
Dependent measures were reliable.                 2/5      40
Interrater reliabilities were reported.           5/13     38
Regression to the mean could be ruled out.        14/16    88

Statistics

Probability of Type 1 error rate was controlled.  16/16   100
Unit of analysis was consistent with unit of      16/16   100
  treatment.
Correlation coefficients were computed within     3/3     100
  groups.
Data assumptions were discussed.                  0/16      0
Cell size was reported.                           8/16     50
Means were reported.                              13/13   100
Standard deviations were reported.                7/13     54
Equal slopes treated in ANCOVA.                   0/1       0
Information was provided as to type of ANOVA      0/4       0
  in unbalanced factorial designs.



*Ratio applies to studies for which the criterion was applicable. That is, the
denominator of the ratio could be less than 16 when the criterion was not 
applicable, or when insufficient information was available to judge whether the 
criterion had been met. 
Table 2. Ratio (and percentage) of studies that met criteria for external
validity (adapted from Lysynchuk et al., 1989).

                                                  Studies that met
                                                  criterion
Description of criterion                          Ratio*    %

Questions in the study were motivated by a        16/16   100
  theoretical or research base.
Characteristics of the sample were described.     3/16     19
Characteristics of the standardization sample     3/3     100
  for measures used in the study were similar
  to the sample that participated in the study.
Information about the reading ability of          1/16      6
  subjects was given.
Information about the readability of text was     3/16     19
  given.
The study included a measure of delayed           0/16      0
  effects.

*Ratio applies to studies for which the criterion was applicable. That is, the
denominator of the ratio could be less than 16 when the criterion was not
applicable, or when insufficient information was available to judge whether the 
criterion had been met. 